Abstract: Toxicogenomics (TGx) is a widely used technique in the preclinical stage of drug development to investigate
the molecular mechanisms of toxicity. A number of candidate TGx biomarkers have now been identified and are utilized
for both assessing and predicting toxicities. Further accumulation of novel TGx biomarkers will lead to more efficient,
appropriate and cost-effective drug risk assessment, reinforcing the paradigm of the conventional toxicology system with
a more profound understanding of the molecular mechanisms of drug-induced toxicity. In this paper, we review some
practical strategies as well as obstacles for identifying and utilizing TGx biomarkers based on microarray analysis. Since
clinical hepatotoxicity is one of the major causes of drug development attrition, the liver has been the best documented
target organ for TGx studies to date, and we therefore focus on information from liver TGx studies. In this review,
we summarize the current resources in the literature in regard to TGx studies of the liver, from which toxicologists could
extract potential TGx biomarker gene sets for better hepatotoxicity risk assessment. (J Toxicol Pathol 2009; 22: 35-52)
Introduction

Although the term "toxicogenomics" (TGx) is 
relatively new, this method is now widely utilized by 
pharmaceutical scientists to investigate the molecular 
mechanisms of toxicity. Although the importance of
functional genomics has been recognized since the
emergence of microarray technology 1,2 , more attention has
been focused on it since the US Food and Drug
Administration (FDA) released a whitepaper 3 showing that
the number of new molecular entities has been decreasing
since 2000, whereas the R&D costs of pharmaceutical
companies have increased dramatically since 1993.
One of the major causes of attrition in the drug development
process is unexpected adverse effects elicited in the clinical
phase, and therefore the preclinical toxicological evaluation
and the clinical trial steps are called the 'critical path' of drug
development in the FDA whitepaper. One estimate
suggests that a 10% improvement in predicting future
failure in the clinical phase would save 100 million dollars
of R&D cost per drug 3 , and the whitepaper emphasized the
importance of modernizing toxicological methodologies
by applying cutting-edge techniques such as TGx and other
"-omics" techniques.

One of the goals of TGx research is to identify novel
biomarkers for evaluating efficacy or toxicity in either
clinical or preclinical settings, which would be as useful as
conventional biomarkers such as the blood enzyme
activity of alanine aminotransferase in evaluating liver injury.
The term 'biomarker' is defined as a characteristic that is
objectively measured and evaluated as an indicator of
normal biological processes, pathogenic processes, or
pharmacologic responses to a therapeutic intervention 4 . In
principle, any biological parameters that are objectively 
measurable and recordable could be potential biomarkers. 
One example of a 'good biomarker' is single nucleotide 
polymorphisms (SNPs) in human CYP2C9 and Vitamin K 
epoxide reductase genes, which are used for optimization of 
the dosing level of warfarin, an anticoagulant drug with a 
great number of serious adverse effects in the US 5 . Such 
biomarkers are not only useful for efficient drug risk 
management but will also lead to the establishment of 
promising markets for pharmaceutical companies. In TGx 
research, the term 'biomarker' does not always refer to a 
single gene, but may consist of sets of genes whose 
expression levels are closely associated with certain 
toxicological endpoints. 

In the TGx research field, the liver has been the 
preferred target organ for the following reasons: i) the 
clinical manifestation of hepatotoxicity is one of the major 
causes of drug development attrition; ii) the exposure level 
of the liver is exceptionally high following drug treatment; 
and iii) it is relatively easy to collect liver samples due to its 
size and homogeneity. In this paper, we outline the literature
resources in regard to candidate TGx biomarkers for liver
toxicity and overview their significance, advantages and
major obstacles in practical use.

Table 1. Representative Analytical Methods for Gene Expression Studies

Method                     Sensitivity  Specificity  Throughput  Notes
Northern blot              ○            ○            ×           Size and alternatively spliced variants of mRNA detectable;
                                                                 cross-hybridization of the probe used detectable
RT-PCR                     ○            ○            △           Need to optimize amplification cycle
Real-time PCR                                                    Easy to perform
  SYBR® Green              ◎            ◎            △           Relatively low cost
  TaqMan®                  ◎            ◎            △           Relatively expensive
Microarray
  cDNA array               ○            ×            ○           Moderately expensive
  Oligonucleotide array                                          Highly expensive
    Expression array       ○            ○            ◎           Relatively high specificity
    Exon array                                                   Detect alternatively spliced variants

The meanings of the symbols in the table are as follows: ◎, Excellent; ○, Very good; △, Good; ×, Poor.

Microarray Technique 

Microarray is the most mature functional genomics 
technique and is now utilized in various fields, including 
pharmacology, toxicology and nutritional science. 
Compared with traditional gene expression analysis 
techniques such as Northern blotting or RT-PCR, microarray 
can measure the expression levels of tens of thousands of 
genes simultaneously, and accordingly, the data acquisition 
is considerably high-throughput (Table 1). In a microarray 
analysis, target samples (i.e., mRNA, cRNA or cDNA) are
labeled with fluorescent dyes (e.g., Cy3, Cy5 or phycoerythrin)
of either one or two colors. The microarray probes
consist of either cDNA or oligonucleotide and are hybridized 
with labeled target samples which have complementary 
nucleotide sequences. 

Although microarrays can be manufactured in a lab 
using specific instruments, a number of microarray 
platforms are now commercially available, including 
GeneChip (Affymetrix, Inc.), Illumina (Illumina, Inc.), 
Codelink (GE Healthcare) and Agilent oligonucleotide 
arrays (Agilent Inc.). Each microarray platform has its own 
advantages and disadvantages. For example, in the Agilent 
2-color (Cy3 and Cy5 dyes) microarray system, the Cy5 dye 
is extremely ozone-sensitive, and its signal is rapidly 
weakened under a high concentration of ozone 6 , which 
results in extremely poor data quality. On the other hand, the 
Affymetrix GeneChip system requires specific instruments, 
and therefore the initial investment is quite high, while the 
cost of preparing a cDNA microarray in-house is relatively 
low, provided the cDNA clones and spotting instrument are 
available. Organizing and maintaining DNA clones, 
however, are tedious and error-prone procedures that can 
easily lead to confusion, and the reliability of the obtained 
data may sometimes be questionable. In contrast,
commercial microarrays usually provide specified kits that
contain all the reagents necessary for all the experimental
processes and, in some cases, are even equipped with 
specialized instruments to automate tedious work such as 
washing and staining the microarrays after hybridization. 
Therefore, commercial microarrays are generally preferred 
by pharmaceutical researchers because they regard these 
advantages to be more cost-effective in the long term. 

Finding Differentially Expressed Genes 

Microarray fluorescence is detected with a scanner after
the microarray has been hybridized with labeled target
samples and washed. The scanned microarray image is then
subjected to gridding and assignment of predefined probe
information using image analysis software such as GenePix Pro
(Molecular Devices). Usually, this step is performed
manually, and it is therefore a tedious procedure.

In the Affymetrix GeneChip system, this process is 
highly automated and easy to complete. After completion of 
the gridding, the image data with the fluorescent signals are 
converted into numerical data followed by background 
subtraction to correct any undesirable bias of the individual 
data derived from the experimental conditions, sample, 
manufacturing variability or other factors. In the GeneChip
system, each gene is represented by a set of two types of
probes, namely Perfect Match (PM) and Mismatch (MM)
probes (typically 11 PM and 11 MM probes, each 25
nucleotides in length, per gene). The PM
probe sequence is complementary to that of the target gene,
while the MM probe sequence contains a single mismatched
base in the middle of the 25-nucleotide sequence, and the MM
probe is used to estimate non-specific binding to the PM
sequence. Since multiple probes are designed for one gene,
one needs to evaluate the expression level of the gene by 
summarizing multiple probe data sets. A number of 
analytical algorithms have been proposed for this 
"summarization" of the probe level data, including MAS5, 
dChip, RMA and GCRMA 7 . MAS5 is a 'chip-by-chip' 
summarization algorithm, while dChip, RMA and GCRMA
are 'model-based' or 'project-based' summarization
algorithms that require relatively high-performance
computers to perform the calculations. In general, the
project-based summarization algorithms yield better quality
datasets in terms of sensitivity and reproducibility. On the
other hand, MAS5 calculations are easy to compute, and
there is no need to perform recalculations on whole data sets
when new GeneChip data is added to a project. Thus, there
is a trade-off in terms of the pros and cons of each method.

Table 2. Representative Multivariate Analysis Methods

Methods                                    Advantage                              Disadvantage
Unsupervised
  Hierarchical clustering,                 Capture the trend of gene expression   The output result is usually
  K-means clustering,                      profiles easily without losing         inconclusive and unclear
  Self-organizing map                      quantitative information
  Principal component analysis             Easy to interpret the result because   May lose significant information
                                           of highly reduced data dimension       during reduction of data dimensions
Supervised
  K nearest neighbors (KNN),               The output result is conclusive        Inappropriate training data set will
  Support vector machine (SVM),            and clear                              generate a poor-performance
  Prediction analysis of microarray (PAM)                                         discriminator

After correction of the individual data biases, the 
numerical data is subjected to normalization so that one can 
perform a comparative analysis among the microarray data 
sets. The easiest normalization is to adjust the global signal 
scale of each set of microarray data (global normalization), 
usually by setting it to the mean or median of the total signal 
data set. Another method is to use external spikes to get a 
standard curve, such as 'Percellome normalization' 8 , in order 
to quantify the mRNA levels. This normalization method 
has been shown to be effective when the gene expression 
changes are extreme, such as in a uterotrophic response 
following activation of estrogen receptor or in an in vitro 
system using a primary cell culture. 
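
As a minimal sketch of the global normalization described above, the following Python function rescales each array so that all arrays share the same median signal. The function name, the toy signal values and the choice of target (the mean of the per-array medians) are assumptions made for illustration and are not taken from any platform's software.

```python
from statistics import mean, median

def global_median_normalize(arrays):
    """Scale each array so that all arrays share the same median signal.

    `arrays` maps an array (sample) name to a list of positive signal values.
    The common target is the mean of the per-array medians (an assumption of
    this sketch; a fixed target value would work equally well).
    """
    medians = {name: median(values) for name, values in arrays.items()}
    target = mean(medians.values())
    return {
        name: [v * target / medians[name] for v in values]
        for name, values in arrays.items()
    }

# Hypothetical raw signals: the second array is uniformly twice as bright,
# as might happen with a different labeling or scanning efficiency.
raw = {
    "control": [100.0, 200.0, 400.0, 800.0, 50.0],
    "treated": [200.0, 400.0, 800.0, 1600.0, 100.0],
}
normalized = global_median_normalize(raw)
```

After scaling, the two toy arrays become directly comparable because the purely global brightness difference has been removed. Gene-specific differences, had there been any, would be preserved.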

After the normalization, one needs to identify the
differentially expressed genes in the chemical-treated group.
Since microarray analysis measures the expression levels of
a large number of genes simultaneously, a straightforward
pair-wise test, such as a t-test, would yield a considerable
number of false-positives. (For example, if we set the
significance level as P < 0.01 for Rat 230 2.0 GeneChip data
consisting of > 30,000 probe sets, we may detect more than
300 positives just by chance.) To address this multiple
testing problem, P-value correction may be performed using
the False Discovery Rate 9 , or two individual filtering criteria
such as fold change and P-value can be used in combination. A
number of filtering methods are provided in the literature,
such as significance analysis of microarrays (SAM) 10 , and
there are a great number of sophisticated algorithms
available as library files on the BioConductor project
website (http://www.bioconductor.org/) 11 that can be
implemented via the open source statistical software R
(http://www.R-project.org).
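
The arithmetic behind the multiple testing problem, and one common form of false discovery rate correction, can be sketched as follows. The Benjamini-Hochberg step-up procedure is used here as a representative FDR method; whether it is the exact method of the cited reference is an assumption, and the P-values are invented for illustration.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests passing `alpha` by chance if no gene changes."""
    return n_tests * alpha

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# About 300 chance positives are expected for 30,000 probe sets at P < 0.01.
chance_hits = expected_false_positives(30000, 0.01)

# Hypothetical raw P-values for eight probe sets.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
q = benjamini_hochberg(p)
significant = [i for i, value in enumerate(q) if value < 0.05]
```

In this toy case five probe sets pass an uncorrected P < 0.05 threshold, but only the first two remain below 0.05 after adjustment.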



Multivariate Analysis 

Since microarray data consist of large amounts of 
numerical data, statistical knowledge, computational skills 
and infrastructure are required to interpret the results. 
Multivariate analysis methods are utilized for both data 
mining and pattern recognition (Table 2). "Unsupervised" 
multivariate analysis includes hierarchical clustering 12 , K- 
means clustering 12 , self-organizing map (SOM) 12 and 
principal component analysis (PCA) 13 . "Supervised" 
multivariate analysis, or discriminant analysis, includes 
Support Vector Machine (SVM) 14 , K-Nearest Neighbors 
(KNN) 15 and Prediction Analysis of Microarray (PAM) 16 . In 
general, each biomarker gene set requires its own specific 
analytical method based on the objective and manner of gene 
set identification. 

Eisen et al. applied hierarchical clustering to visualize 
the trend of gene expression profiles 17 , and since then the 
hierarchical clustering method has been widely preferred by 
toxicologists when interpreting microarray data. In the case 
of K-means clustering and SOM, one needs to specify the 
number of clusters to be formed before the calculation. PCA 
is utilized to reduce the dimensions of the microarray data
to 2 or 3; this makes it much easier to recognize the gene
expression pattern. 
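
The requirement to fix the number of clusters in advance can be seen in a small K-means sketch. This is a plain implementation of Lloyd's algorithm with deterministic seeding (the first k profiles), chosen only to keep the example reproducible; the two-gene expression profiles are hypothetical.

```python
from math import dist

def k_means(profiles, k, iterations=20):
    """Cluster profiles into k groups with Lloyd's algorithm.

    The first k profiles seed the centroids, which keeps this sketch
    deterministic; real analyses usually use random or k-means++ seeding.
    """
    centroids = [tuple(p) for p in profiles[:k]]
    assignment = []
    for _ in range(iterations):
        # Assign every profile to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: dist(centroids[c], p)) for p in profiles
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(profiles, assignment) if a == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(col) / len(col) for col in zip(*members))
    return assignment, centroids

# Hypothetical log-ratio profiles (two genes per sample).
profiles = [(0.1, 0.2), (2.0, 2.1), (0.0, -0.1), (2.2, 1.9), (-0.2, 0.1), (1.8, 2.0)]
labels2, _ = k_means(profiles, k=2)
labels3, _ = k_means(profiles, k=3)
```

With k = 2 the samples split into the two visually obvious groups, whereas k = 3 forces one of those groups to be subdivided. The algorithm always returns the number of clusters requested, whether or not the data support it.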

Discriminant analysis, such as SVM, KNN and PAM, is 
an application of machine-learning algorithms and is 
frequently used for toxicity prediction based on microarray 
data. The sample size and appropriate selection of the 
training data set are crucial for establishing reliable 
classifiers. This type of discriminant analysis is also applied 
to quality control of microarray data 18 . 
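
As an illustration of how a discriminant method assigns a class label, the sketch below implements a basic K-nearest-neighbors vote with Euclidean distance. The training profiles, labels and the choice of k = 3 are hypothetical, and practical classifiers are built on many more genes and samples.

```python
from collections import Counter
from math import dist

def knn_predict(training, query, k=3):
    """Classify `query` by majority vote among its k nearest training profiles.

    `training` is a list of (expression_profile, label) pairs and the
    distance is Euclidean.
    """
    neighbours = sorted(training, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical training set: log ratios of two marker genes per compound.
training = [
    ((2.1, 1.8), "toxic"),
    ((1.9, 2.2), "toxic"),
    ((2.4, 1.6), "toxic"),
    ((0.1, -0.2), "non-toxic"),
    ((-0.3, 0.2), "non-toxic"),
    ((0.2, 0.1), "non-toxic"),
]

prediction_a = knn_predict(training, (1.7, 1.9))
prediction_b = knn_predict(training, (0.0, 0.3))
```

The output is always one of the training labels, which is why the composition of the training set matters so much for the reliability of the prediction.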

As described above, microarray analysis consists of 
multiple steps from in vivo / in vitro studies to microarray
data interpretation (Fig. 1), and each step includes specific 
points to be considered in order to avoid misinterpretation of 
the obtained results. 

Literature Resources for TGx Biomarkers in 
Regard to Liver Toxicity 

The reports in the literature related to liver toxicity-relevant
gene sets obtained from TGx studies are
summarized in Table 3. A great number of TGx studies of
the liver have been reported using various species,
such as rats, mice, humans, monkeys and canines, and these
studies contain a number of toxicity-relevant gene sets that
could be potential TGx biomarkers for assessing/predicting
liver toxicity.

Fig. 1. General flow of a TGx study. The general flow of a TGx study is presented. Conventional toxicologic parameters, such as body / organ
weights, histopathological findings, blood chemistry and toxico / pharmacokinetics, and functional genomics information, such as
microarray data, are collected. The genomics data sets are huge and need to be organized into a well-designed database. Interpretation of
the genomics data depends on the quality of the database, the analytical tools and experienced researchers' interdisciplinary
knowledge and skills in biology, toxicology, statistics and computational sciences. A number of issues are yet to be determined to
establish a standard operating procedure (SOP) for the public, including the content / format of the final report, recording items, statistical
analysis to be performed for genomics data, etc. All the information should be appropriately recorded so that the obtained TGx data can
be exchangeable across laboratories.
[Flow diagram. Required issues listed in the figure: SOP and study design; SOP, microarray platform, standardization of recording items and
setup of QC parameters; design / development of database system, software and database compatibility; statistics, biomarker selection and
computational modeling; multi / interdisciplinary knowledge / skill; report format and contents.]

Hepatotoxicity animal models using prototypical 
toxicants such as acetaminophen or carbon tetrachloride 
have been widely tested in TGx studies, and a number of 
gene sets associated with liver injury have been reported. 
Since these gene sets consist of a mixture of primary 
responses associated with cell death as well as secondary or 
more downstream responses such as inflammation caused by
Kupffer cells or infiltrating lymphocytes, one needs to dissect
the stimulated biological pathways carefully to interpret the 
biological significance associated with gene expression 
changes. 

Waring et al. reported that the hepatic gene expression
profiles in rats following treatments with various chemicals 
showed clear chemical-specific patterns 19 . Based on this 
result, one can assume that such chemical-specific changes 
in the transcriptome profile would lead to changes in the 
proteome profile, the metabolome profile and eventually the 
histopathological phenotypes at later time points. This 
concept led toxicologists to expect that one might be able to 
utilize microarray data to predict later histopathological 
changes that are not detectable at earlier time points. As
stated previously, such chemical-specific gene expression
profiles, or 'chemical fingerprints', contain mixed molecular 
events that result from complicated interactions between 
biological pathways, such as xenobiotic metabolism, stress 
response, energy metabolism, protein synthesis / 
degradation, mRNA transcription / degradation, DNA repair 
/ replication and cell growth / cell death control. By 
comparison with data for prototypical chemicals whose 
molecular mechanisms of toxicity have been well 
investigated, one may be able to identify the key gene sets, or 
TGx biomarkers whose expression levels are highly 
associated with specific toxicological events, by dissecting 
the specific molecular pathway from the mixed molecular 
events. These TGx biomarkers can then be utilized for the 
evaluation, diagnosis or prediction of toxicity based on their 
expression changes. For example, carcinogenicity tests in 
the preclinical stage of drug development require highly 
time- and labor-consuming tasks, and thus the identification 
of TGx biomarker genes for carcinogenicity prediction 
would dramatically reduce R&D time and costs for 
pharmaceutical companies. 

Table 3. TGx Biomarkers for Liver Toxicity

Toxicity (tissue or cultured cells)     Species                              Reference
Gene expression signature               Rat                                  19-32
Drug metabolizing enzymes               Rat                                  33-37
Cell injury (multiple mechanisms)       Rat                                  38-58
                                        Mouse                                48, 59-63
                                        Human                                39, 64-67
Carcinogenicity                         Rat                                  68-82
                                        Mouse                                83-92
                                        Human                                93, 94
Steatosis / fatty liver                 Rat                                  95-97
                                        Mouse                                98-101
                                        Human                                102, 103
Oxidative stress                        Rat                                  104
                                        Mouse                                105-108
                                        Human                                109
Phospholipidosis                        Rat                                  110
                                        Human                                111-112
Glutathione depletion                   Rat                                  113-115
                                        Mouse                                115
                                        Canine                               115
Fibrosis                                Rat                                  116-122
                                        Mouse                                123-125
                                        Human                                126-132
ER stress                               Rat                                  133
                                        Mouse                                134, 135
                                        Human                                136
Mitochondrial function                  Rat                                  137
                                        Mouse                                138-140
                                        Human                                137
PPARα-mediated response                 Rat                                  76, 141-145
                                        Mouse                                146-148
                                        Canine                               144
Estrogen receptor signalling            Rat                                  76, 149-151
                                        Mouse                                152
AhR signalling                          Rat                                  153-157
                                        Mouse                                156, 158-162
                                        Human                                163
Immune-related response                 Rat                                  164, 165
                                        Mouse                                166-168
                                        Canine                               169
Anemia                                  Rat                                  170
Transporters                            Rat, Mouse, Human, Monkey, Canine    171
Baseline gene expression information    Rat                                  172, 173

Abbreviations: ER, endoplasmic reticulum; PPAR, peroxisome
proliferator-activated receptor; AhR, aryl hydrocarbon receptor.

Utilization of TGx Biomarkers

One of the practical applications of TGx biomarkers is
to prioritize drug candidates according to their toxicity
profiles based on microarray data. An example is presented
in Fig. 2, in which six TGx biomarkers for assessing the
induction of drug metabolizing enzymes, PPARα activation,
cell proliferation, glutathione depletion, inflammation or
oxidative stress were used to evaluate chemical-induced
toxicities in the rat liver. The general trend of the gene
expression changes in each biomarker gene set was
estimated using the TGP1 score 174 . The TGP1 score profile
for each chemical is visualized by hierarchical clustering in
Fig. 2, which demonstrates that each chemical shows
characteristic changes in gene expression levels that are
associated with specific toxicity endpoints. Ideally,
chemicals showing weaker effects in all the toxicity
categories would be promising drug candidates.

In Fig. 3, a model case is presented for identifying a 
candidate TGx biomarker gene set associated with 
glutathione depletion, which is known to play a crucial role 
in acetaminophen (APAP)-type liver injury 175 . Male F344 
rats were treated with the glutathione depletor L-buthionine 
(S, R)-sulfoximine (BSO), and microarray analysis was 
conducted on the liver using RG U34A GeneChip. A total of 
69 probe sets were identified with signal levels that were 
inversely correlated with the hepatic glutathione content 
(Fig. 3A). The validity of the gene set was tested using time- 
course microarray data for rat livers treated with APAP. As 
demonstrated in Fig. 3B, the 69 probe sets clearly classified the
animal groups following APAP treatment and showed that
the 24 h APAP group was clustered together with the BSO-
treated rats 113 ; this indicates that the gene expression profiles
of the APAP-treated (24 h) and BSO-treated rats are very
similar and therefore that the 69 probe sets used are associated
with glutathione depletion. In another experiment, more
detailed TGx data were collected using another
glutathione-depleting agent, phorone 114 , and the results of that
experiment showed that the 'glutathione depletion-
responsive genes' maintain a high expression level even
after the hepatic glutathione content recovered from acute 
glutathione depletion immediately after the phorone 
treatment. Accordingly, it may be better to call these genes 
'glutathione homeostasis-associated genes' rather than 
'glutathione depletion-responsive genes' in order to prevent 
misinterpretation of the microarray results. 
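
The selection of probe sets whose signals are inversely correlated with hepatic glutathione content can be sketched with a simple Pearson correlation filter. The correlation threshold, the probe names and all values below are hypothetical; the selection criterion actually used to obtain the 69 probe sets is not stated here, so this should be read as one plausible way to perform such a filter.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)

def inversely_correlated(signals, phenotype, threshold=-0.9):
    """Return probe names whose signal correlates with `phenotype` at or below `threshold`."""
    return [
        probe for probe, values in signals.items()
        if pearson(values, phenotype) <= threshold
    ]

# Hypothetical hepatic glutathione content in four animals (arbitrary units).
glutathione = [8.0, 6.0, 4.0, 2.0]

# Hypothetical log signals of three probe sets in the same four animals.
signals = {
    "probe_a": [1.0, 2.0, 3.1, 4.0],  # rises as glutathione falls
    "probe_b": [5.0, 5.2, 4.9, 5.1],  # unrelated to glutathione
    "probe_c": [4.0, 3.0, 2.0, 1.0],  # falls together with glutathione
}
selected = inversely_correlated(signals, glutathione)
```

Only the probe set that rises as glutathione falls is retained. As the phorone experiment above suggests, a correlation observed in one study design does not by itself establish that the genes respond to depletion rather than to glutathione homeostasis more broadly.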

Although hierarchical clustering (Fig. 2) and PCA (Fig. 
3) are easy to implement, the obtained results are sometimes 
not conclusive, and the interpretation of the results requires a 
certain level of proficiency. On the other hand, discriminant 
analysis, such as SVM, generates conclusive results, such as 
'toxic' or 'non-toxic'. The general procedure for SVM 
analysis is presented in Fig. 4. The first step is to prepare 
training data sets, such as microarray data for "carcinogenic 
compounds (positive)" and "non-carcinogenic compounds 
(negative)". Next, one develops a 'classifier' with these 
training data sets using the machine learning algorithm of 
SVM. Once the classifier is developed, a positive / negative
outcome can be predicted for a test compound with an unknown
toxicological profile. Although the results produced by a
discriminant analysis are conclusive, they are not reliable if 
the training sets are not selected properly. Furthermore,
even when the cross-validation of the established classifier
certifies good performance for the training data sets used, it
may not work if the test compound induces a toxicity whose
mechanism is rare or new and has not been considered in the
training data sets. For all these reasons, the classifiers
should be continuously updated to improve the classification
performance.

Fig. 2. Characterization of hepatic toxicity profile. An example of characterizing the hepatic toxicity profile is presented. In this figure, six TGx
biomarker gene sets associated with a) phase I drug metabolizing enzyme (DME), b) PPARα-regulated genes, c) cell proliferation, d)
glutathione depletion, e) inflammation and f) oxidative stress are used to assess toxicity profiles based on the microarray data for rat livers
treated with one of 90 chemicals. The microarray data was retrieved from TG-GATEs, a TGx database developed by the Toxicogenomics
Project in Japan (TGP), after obtaining permission. The expression changes for each biomarker set were summarized and estimated using
the TGP1 score 174 , and the TGP1 score was subjected to hierarchical clustering. The red and blue colors indicate that the genes included
in the TGx biomarker were generally up- or down-regulated, respectively, and the black color indicates that the expression level of the
TGx biomarker gene sets did not show characteristic changes as a whole. Ideally, chemicals that do not affect the expression levels of
genes included in the TGx biomarker would be desirable drug candidates. This strategy is applied to rank the chemicals based on the
toxicity profiling.
[Heat map of gene sets (a)-(f) by time point (h); color scale from decrease to increase in gene expression. Chemical clusters are labeled as
enzyme inducers; PPARα activators; enzyme induction / glutathione depletion / oxidative stress inducers; liver injury / inflammation
inducers (higher oxidative stress); and liver injury / inflammation inducers (lower oxidative stress).]

Fig. 3. Identification and application of TGx biomarkers for assessing glutathione depletion. A model case for identifying the candidate TGx
biomarkers associated with glutathione depletion-type (acetaminophen-type) liver injury is presented. Rats were treated with a
glutathione depletor L-buthionine (S, R)-sulfoximine (BSO), and GeneChip analysis was conducted on the liver. (A) A total of 69 probe
sets were identified whose signal values were inversely correlated with the hepatic glutathione content. (B) The validity of the 69 probe
sets as candidate TGx biomarkers for evaluation of glutathione depletion was evaluated by PCA using time-course microarray data for rat
livers treated with acetaminophen. The 69 probe sets clearly classified the animal groups following acetaminophen treatment, and the
24 h acetaminophen group was clustered together with the BSO-treated rats, suggesting that glutathione homeostasis was highly
affected at this time point. Reprinted from Reference 113 , with permission from Elsevier.
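
The caveat about cross-validation of classifiers noted above can be illustrated with a small sketch. For brevity, a nearest-centroid classifier is swapped in for SVM here, and all profiles and labels are hypothetical. The point is the leave-one-out procedure and its blind spot, not the particular classifier.

```python
from math import dist

def nearest_centroid_fit(samples):
    """Return {label: centroid} from a list of (profile, label) pairs."""
    grouped = {}
    for profile, label in samples:
        grouped.setdefault(label, []).append(profile)
    return {
        label: tuple(sum(col) / len(col) for col in zip(*profiles))
        for label, profiles in grouped.items()
    }

def nearest_centroid_predict(centroids, profile):
    """Return the label of the centroid closest to `profile`."""
    return min(centroids, key=lambda label: dist(centroids[label], profile))

def leave_one_out_accuracy(samples):
    """Hold out each sample in turn, train on the rest, and score the prediction."""
    correct = 0
    for i, (profile, label) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        centroids = nearest_centroid_fit(training)
        if nearest_centroid_predict(centroids, profile) == label:
            correct += 1
    return correct / len(samples)

# Hypothetical training data: log ratios of two marker genes per compound.
samples = [
    ((2.1, 1.8), "carcinogenic"),
    ((1.9, 2.2), "carcinogenic"),
    ((2.4, 1.6), "carcinogenic"),
    ((0.1, -0.2), "non-carcinogenic"),
    ((-0.3, 0.2), "non-carcinogenic"),
    ((0.2, 0.1), "non-carcinogenic"),
]
accuracy = leave_one_out_accuracy(samples)

# A compound acting through a mechanism absent from the training set still
# receives one of the two known labels, with no warning that it is an outlier.
novel_prediction = nearest_centroid_predict(nearest_centroid_fit(samples), (-3.0, 5.0))
```

The cross-validated accuracy on this toy set is perfect, yet the profile far from both training groups is still forced into one of them. Cross-validation measures performance only on compounds that resemble the training data.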

Microarray Database for TGx Research 

To interpret the microarray data appropriately, it is 
desirable to perform comparative analysis with data 
obtained from prototypical toxicants. Developing a large- 
scale reference database, however, is not easy to 
accomplish, and therefore public databases, such as Gene
Expression Omnibus (GEO) 176 , ArrayExpress 177 , Chemical
Effects in Biological Systems (CEBS) 178 , Comparative
Toxicogenomics Database (CTD) 179 or EDGE 180 , can be
used to obtain reference microarray data. In addition to
public microarray databases, large-scale TGx databases
have been developed by collaborative consortiums such as
the Toxicogenomics Project in Japan
(http://wwwtgp.nibio.go.jp/index.html) and the InnoMed PredTox
Consortium (http://www.innomed-predtox.com/), both of
which contain microarray datasets for prototypical
chemicals as well as proprietary drugs using both in vivo and
in vitro systems. Animal and study information as well as
microarray data can be retrieved from such databases
provided that the TGx datasets were submitted with
information compliant with MIAME, a guideline proposed by
the Microarray and Gene Expression Data (MGED)
Society 181 to facilitate microarray data sharing. Recently, a
number of major scientific journals have begun to require 
investigators to deposit MIAME-compliant study 
information as well as microarray datasets at the time of or 
prior to the submission of manuscripts to their respective 
journals. This trend will continue because one cannot 
interpret microarray data appropriately without detailed 
study information. 

Consistency of Microarray Data 

Concerns have been raised regarding the 
reproducibility of microarray datasets across laboratories 
and microarray platforms. Some papers have reported
inconsistency of interlaboratory / inter-platform
microarray results 182,183 , while others have reported good
concordance among laboratories 184-186 or inconclusive
results 48,187 . In addition to such laboratory-specific
biases, a number of factors cause fluctuations in baseline 
animal data, such as gender, organ section, strain and fasting 
state before chemical dosing 173 . Furthermore, the vehicle 
substance used for animal dosing affects the baseline gene 
expression profile 172 , and therefore it is not appropriate to 
analyze the microarray data sets directly without 
consideration of the animal study conditions. In this sense, 
even the MIAME guidelines may not be sufficient for
standardizing the TGx study conditions, and additional
practical standards may be required to overcome this
problem 188 .

Even within the same GeneChip platform, the baseline 
microarray data fluctuates among laboratories. This 
inconsistency of microarray data is evident among the 
different generations of rat GeneChips, namely the RG 
U34A and RAE 230A arrays (Fig. 5A). Practically, we may 
avoid such inconsistency between two sets of array data by 
adjusting the median of the signal value between the two 
datasets (Fig. 5B) 189 , and 'legacy TGx datasets' can thereby 
be used together with new datasets. 

The MicroArray Quality Control (MAQC) Consortium
performed a detailed data comparison in regard to inter /
intra-platform microarrays across several laboratories and
reported that microarray data shows generally high
interlaboratory and inter-platform compatibility if fold-
change ranking plus a less stringent statistical cutoff (such as
a t-test P-value threshold) are used as the filtering criteria, provided that the
expression levels of the filtered genes are relatively high 190 .
However, other reports have pointed out that the analytical 
procedure in the MAQC report was inadequate, and 
therefore the conclusion drawn is questionable 191 . In 
general, however, the reproducibility of interlaboratory 
microarray data tends to be high when the genes are filtered 
by fold-change values 192 rather than by stringent P-values in 
the statistical analysis. 
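
The idea of comparing fold-change-ranked gene lists between laboratories can be sketched as follows. The sketch ranks genes by absolute log2 fold change and reports the percentage of genes shared by two top-N lists. It omits the accompanying P-value filter and the expression-level condition, and the gene names and signal values are hypothetical.

```python
from math import log2

def top_genes_by_fold_change(treated, control, n):
    """Rank genes by absolute log2 fold change and return the top n gene names."""
    scores = {gene: abs(log2(treated[gene] / control[gene])) for gene in treated}
    return sorted(scores, key=scores.get, reverse=True)[:n]

def percent_overlap(list_a, list_b):
    """Percentage of genes shared by two gene lists, relative to the shorter list."""
    shared = set(list_a) & set(list_b)
    return 100.0 * len(shared) / min(len(list_a), len(list_b))

# Hypothetical mean signals for five genes measured in two laboratories.
control = {"g1": 100.0, "g2": 100.0, "g3": 100.0, "g4": 100.0, "g5": 100.0}
treated_lab_a = {"g1": 800.0, "g2": 400.0, "g3": 110.0, "g4": 20.0, "g5": 95.0}
treated_lab_b = {"g1": 700.0, "g2": 350.0, "g3": 90.0, "g4": 30.0, "g5": 120.0}

top3_a = top_genes_by_fold_change(treated_lab_a, control, 3)
top3_b = top_genes_by_fold_change(treated_lab_b, control, 3)
overlap_top3 = percent_overlap(top3_a, top3_b)

top2_a = top_genes_by_fold_change(treated_lab_a, control, 2)
top2_b = top_genes_by_fold_change(treated_lab_b, control, 2)
overlap_top2 = percent_overlap(top2_a, top2_b)
```

In this toy case the two laboratories agree fully on the three most changed genes but only partly on the top two, which shows how the apparent concordance depends on the list length chosen.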

Species Difference Issues 

Because experimental animals are used in preclinical 
toxicology studies, species differences are always major 
concerns. A number of papers have reported significant 
species-specific responses against chemical treatments, 
even among the rodents. For instance, l,4-bis-[2-(3,5,- 
dichloropyridyloxy)] benzene (TCPOBOP) acts as a potent 
phenobarbital-type enzyme inducer in mouse liver but not in 
the rat or human liver. This species-specific response is 
associated with the substitution of Thr350 in the mouse 
constitutive androstane receptor (CAR), a nuclear receptor 
activated by TCPOBOP, with Met in rat and human 
CAR 193195 . On the other hand, the phenobarbital-type 
enzyme inducer 2,4,6-triphenyldioxane-1,3 induces hepatic 
CYP2B in rats but not in mice 196 . Since CAR regulates 
hepatic drug-metabolizing enzymes and transporters 197 , such 
differential regulation may contribute to these dramatic species 
differences in drug metabolism and disposition. 

In the case of the estrogenic environmental contaminant 
o,p'-DDT, hepatic Cyp17a1 is preferentially upregulated in 
mice 198 but not in rats 199 , even though the majority of 
orthologous genes exhibit similar gene expression profiles in 
mice and rats following o,p'-DDT treatment (Fig. 6). Since 
CYP17A1 is one of the key steroidogenic enzymes, the 
mouse-specific upregulation of Cyp17a1 may alter 
endocrine sex hormone homeostasis. As expected, the blood 
level of DHEA-S, a precursor of sex hormones produced by 
CYP17A1, is elevated only in mice 198 , and this may lead to 
endocrine perturbation in addition to the direct estrogenic 
Toxicity prediction by Support Vector Machine algorithm. 
Support Vector Machine is a popular discriminant analysis 
algorithm. The first step in this algorithm is to prepare a 
training data set, such as microarray data for a "carcinogenic 
compound (positive)" and "non-carcinogenic compound 
(negative)". Next, a classifier is developed with the training 
data using the machine learning algorithm. By using the 
developed classifier, one can predict a positive / negative 
outcome (carcinogenic / non-carcinogenic outcome in the 
figure) for a test compound with an unknown toxicological 
profile. The accuracy of the prediction by the classifier can be 
estimated by cross-validation using the training data set. Gray 
and green indicate 'Positive' and 'Negative' classification 
areas, respectively. Red spots indicate the support vectors 
used for the classification of the test data set. 
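The workflow in this caption (prepare labelled training data, develop a classifier, estimate accuracy by cross-validation, then predict an unknown compound) can be sketched as follows. This is a minimal illustration under stated assumptions: it swaps in a linear soft-margin SVM trained by stochastic sub-gradient descent on the hinge loss instead of a library implementation, uses two hypothetical gene features per compound, and all values, function names and parameter settings are invented for the example.

```python
def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=100):
    """Linear soft-margin SVM via sub-gradient descent on the regularised hinge loss.

    X: list of feature vectors; y: labels, +1 (positive) or -1 (negative).
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - eta * lam) for wj in w]      # L2 regularisation shrink
            if margin < 1.0:                               # sample violates the margin
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b

def predict(model, x):
    """Return +1 (positive) or -1 (negative) for one feature vector."""
    w, b = model
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1

def leave_one_out_accuracy(X, y):
    """Estimate accuracy by leave-one-out cross-validation on the training set."""
    correct = 0
    for i in range(len(X)):
        model = train_linear_svm(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += predict(model, X[i]) == y[i]
    return correct / len(X)

# Hypothetical training data: log2 ratios of two marker genes per compound
X = [[2.0, 1.5], [1.5, 2.0], [2.5, 2.5],        # "carcinogenic" (positive)
     [-2.0, -1.5], [-1.5, -2.0], [-2.5, -2.5]]  # "non-carcinogenic" (negative)
y = [1, 1, 1, -1, -1, -1]

cv_accuracy = leave_one_out_accuracy(X, y)
model = train_linear_svm(X, y)
test_prediction = predict(model, [1.8, 2.2])    # compound with unknown profile
print(cv_accuracy, test_prediction)
```

With real microarray data the feature vectors would contain many genes and the classes would rarely be this cleanly separated, so the cross-validated accuracy is the quantity to inspect before trusting any prediction.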
Fig. 5. Overcoming the discrepancy between old and new GeneChip data. Even within the same GeneChip platform, the inconsistency in 
microarray data is evident among the different generations of rat GeneChips, namely RG U34A and RAE 230A arrays, and this hinders 
utilization of 'legacy TGx knowledge' obtained from older microarrays. (A) The median signal values of the vehicle-treated rats were 
adjusted between the RG U34A and RAE 230A GeneChip data. The results for 4 representative genes are presented. (B) Principal 
component analysis using baseline-corrected RG U34A and RAE 230A GeneChip data was performed using the glutathione depletion- 
associated genes presented in Fig. 3. Adjustment of the baseline signal levels considerably improved the data compatibility between the 
RG U34A and RAE 230A GeneChip data; the spots for each treated chemical moved closer together (cf. inside area of the dashed circles). 
Reprinted from Reference 189 , with permission from Elsevier. 
Fig. 6. Species-specific regulation of the hepatic Cyp17a1 gene elicited by o,p'-DDT. Correlation analysis between mice and rats was performed 
using differentially expressed orthologous genes in the liver elicited by o,p'-DDT. The temporal profiles of the o,p'-DDT-treated mouse 
liver 198 and those of the o,p'-DDT-treated rat liver 199 were compared by determining Pearson's correlation of the temporal gene 
expression (fold change) and significance (p1(t) value by empirical Bayesian analysis) between orthologs, and the results of this 
comparison are presented as a scatter plot. Correlations of gene expression and significance approaching 1.0 indicate that the behaviors of 
the orthologous genes are similar; such orthologs fall within the upper right quadrant. (A) Orthologs tended to localize in the upper- or lower- 
right quadrants, indicating that the temporal gene expression changes for o,p'-DDT-treated mouse and rat liver are comparable. However, 
poor correlations between the temporal p1(t) values and gene expression fold changes would fall within the lower left quadrant. Cyp17a1, 
one of the poor-correlation genes, fell into this quadrant, suggesting that significant differences exist between the rat and mouse orthologue 
expression profiles. (B) The hepatic Cyp17a1 gene expression levels following o,p'-DDT treatment were compared between rats and 
mice by QRT-PCR. Significant species-specific regulation of the hepatic Cyp17a1 gene was observed. * P < 0.05 by a two-way ANOVA 
followed by pairwise comparisons using Tukey's test. 



activity of o,p'-DDT. Furthermore, the hepatic CAR mRNA 
level is decreased in mice but is increased in rats 199 , and this 
could result in differential xenobiotic metabolism and 
disposition in the liver, considering CAR's role in regulating 
cassettes of hepatic drug metabolizing enzymes. Thus, 
marked species differences in hepatic response against 
chemical treatment have been observed even among rodents, 
and these phenomena confound the extrapolation of toxicity 
data from animals to humans. Nevertheless, the 
identification of potential modes of action as well as species- 
specific responses may assist in the development or selection 
of more appropriate models for assessing the toxicity of 
xenobiotics. 
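The ortholog comparison summarised in Fig. 6 rests on correlating temporal expression profiles between species. The sketch below shows only the fold-change part of that idea, computing Pearson's correlation for each ortholog pair and flagging pairs that diverge. It is a simplified illustration, not the published analysis: the significance (empirical Bayes) axis is omitted, and the gene names, time courses, threshold and function names are hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson's correlation coefficient between two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def flag_divergent(mouse, rat, threshold=0.0):
    """Return {gene: (r, divergent)} where divergent means r is below the threshold."""
    result = {}
    for gene in mouse:
        r = pearson(mouse[gene], rat[gene])
        result[gene] = (r, r < threshold)
    return result

# Hypothetical log2 fold changes at four time points after treatment
mouse = {"geneX": [0.5, 1.0, 2.0, 3.0], "geneY": [0.2, 1.5, 2.5, 3.5]}
rat = {"geneX": [0.4, 1.2, 1.8, 2.9], "geneY": [0.1, -0.5, -1.0, -1.2]}

comparison = flag_divergent(mouse, rat)
for gene, (r, divergent) in comparison.items():
    print(gene, round(r, 3), "divergent" if divergent else "similar")
```

Orthologs with a correlation near 1.0 behave similarly in the two species, whereas a low or negative correlation marks a candidate species-specific response that would need confirmation by an independent method such as QRT-PCR.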

Future Perspectives 

As the number of TGx biomarkers rapidly increases, 
some of them will be promising biomarkers that will lead to 
better understanding of the molecular mechanisms and 
prediction of toxicity in humans based on preclinical data. 
However, many of the candidate TGx biomarkers are 
applicable only to animals, and their feasibility as clinical 
biomarkers remains unclear. Idiosyncratic drug-induced 
hepatotoxicity 200 , which is not detectable in conventional 
preclinical toxicity studies, is one of the major causes of 
failure in drug development after the onset of clinical trials, 
and therefore novel TGx biomarkers which can detect signs 
of idiosyncratic hepatotoxicity are eagerly awaited. 

Recently, seven new renal toxicity biomarkers, 
including Kim-1, β2-microglobulin and Cystatin C, were 
officially qualified for particular uses in regulatory decision- 
making by the US FDA and European Medicines Agency 
(EMEA) 201 . These biomarkers were submitted by the 
Predictive Safety Testing Consortium (PSTC) led by the 
non-profit Critical Path Institute (C-Path; http://www.c-path.org/). 
In addition to these novel renal biomarkers, TGx 
biomarkers for hepatotoxicity will need a similar 
qualification (or validation) process through collaborative 
research like that of C-Path. 

Identification of TGx biomarkers may lead to the 
discovery of other biomarkers (genes, proteins or 
metabolites), the detection of which is easier than measuring 
hepatic mRNA levels. For example, renal Kim-1 gene 
expression is upregulated in response to renal injury 202 , and 
therefore the Kim-1 mRNA level can be a renal toxicity 
biomarker. However, Kim-1 protein is also detectable in 
urine 203 , and thus the urine Kim-1 protein is a much more 
convenient biomarker to measure compared with the renal 
Kim-1 mRNA level. Likewise, new surrogate hepatotoxicity 
biomarkers, which are more convenient to detect than 
hepatic mRNA, could be discovered through a profound 
understanding of the molecular mechanisms of toxicity by 
utilizing TGx mRNA biomarkers. 'Ideal' TGx biomarkers 
for hepatotoxicity will be those that are sensitive, specific, 
predictive and, above all, 'extrapolatable' to humans, and it 
is the responsibility of pharmaceutical toxicologists to 
discover/establish novel biomarkers to assist in the 
improvement of risk assessment in humans. 